Abstract This study conducts a comprehensive analysis of time series segmentation
on the Japanese stock prices listed on the first section of the Tokyo Stock Exchange
during the period from 4 January 2000 to 30 January 2012. A recursive segmentation
procedure is used under the assumption of a Gaussian mixture. The daily number
of segments in each quintile of volatility is investigated empirically. It is
found that from June 2004 to June 2007 a large majority of stocks were stable and
that from 2008 several stocks showed instability. In March 2011, the daily number
of unstable securities steeply increased due to societal turmoil following the Great
East Japan Earthquake. It is concluded that the number of stocks included in each
quintile of volatility provides useful information on macroeconomic situations.

Key words: Tokyo Stock Exchange, Likelihood-ratio test, chi-squared distribution, 
1 Introduction

Recently, interest in the relationship between price movements and events or news
seems to have increased [?, ?, ?]. Traded prices are assumed to be signals resulting
from both exogenous and endogenous factors.

Agent-based models of financial markets have been developed over the last decade [?,
?, ?, ?, ?, ?]. Such models normally assume buyers and sellers and consider a market
that includes several types of agents. Fundamentalists, chartists, and noise traders
may interact in financial markets. The fundamentalists know the actual value of a
stock, but the actual value fluctuates in time. The chartists watch price movements
and determine their prices based on them. The chartists are classified into
trend followers and contrarians. The noise traders determine their trading prices
randomly. As modelled above, a financial market consisting of various types of
participants is a kind of agent-based information processing system embedded in the
real world.

In the last two decades, statistical properties of asset price returns have been
extensively studied in the econophysics literature [1, 2]. One of the important
properties is that the probability distribution of returns is fat-tailed [3, 4].
Several researchers have reported that the tails of log-return distributions follow
a power law and are well fitted by Student's t-distributions [5] or q-Gaussian
distributions founded in nonextensive statistical thermodynamics [6, 7]. The Beck
model was introduced as a dynamical foundation of Tsallis statistics [8]. Although
it was originally intended to describe mechanical systems such as turbulence, its
basic idea of a fluctuating temperature is consistent with the heteroscedasticity
of markets. Thus, in recent studies, it is employed to elucidate price fluctuations
in markets [9, 10], and it is called superstatistics [11]. Moreover, the boom-bust
cycle is associated with the existence of bubbles in stock markets. This cycle may
be quantified by the collective behavior of stock prices. Researchers have shown
that there is some degree of collective behavior and synchronization in the returns
of actual stocks [?].

Macroeconomic situations strongly influence money flows at all levels of society.
Stocks in each sector are traded by investors and traders through stock exchange
markets every minute. Moreover, stock prices are so sensitive to money flows that
the stock prices of all sectors depend on the supply and demand of money among
economic actors. Therefore, they are expected to be useful for detecting
changes in macroeconomic situations.

In this study, we hope to provide some insights into the problem of quantifying
Japanese macroeconomic situations through a comprehensive analysis of
stock prices traded on the Tokyo Stock Exchange. In the context of economics and
finance, there are various methods to segment highly nonstationary financial time
series into stationary segments called regimes or trends. Following the pioneering
works by Goldfeld and Quandt [12], there is an enormous literature on detecting
structural breaks or change points separating stationary segments. Recently, a
recursive entropic scheme to segment financial time series was proposed [13]. That
work investigated segments for Japanese stock indices; however, it did not consider
the stock prices themselves.

A recursive segmentation procedure is applied to analyze security prices of 1,413 
Japanese firms listed on the first section of the Tokyo Stock Exchange. In this paper, 
the number of segments in quintiles in terms of variance is computed in order to 
detect change points of money flows of the Japanese security market. 

This article is organized as follows. In Sec. 2, the recursive segmentation
procedure is briefly explained. In Sec. 3, the segmentation procedure is performed
for artificial time series. In Sec. 4, the empirical analysis with daily log-returns
for the last 10 years is conducted. Sec. 5 is devoted to the conclusion.
2 Method



2.1 Segmentation procedure 

Let $O_{i,t}$ and $E_{i,t}$ be, respectively, the daily opening and closing prices
of the $i$-th stock ($i = 1, \ldots, M$) at day $t$ ($t = 1, \ldots, n$), where $M$
denotes the total number of stocks and $n$ the total number of observations. The
daily log-return (opening to closing) time series $x_{i,t}$ ($i = 1, \ldots, M$;
$t = 1, \ldots, n$) is computed as

$$x_{i,t} = \log E_{i,t} - \log O_{i,t}. \tag{1}$$
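As a minimal sketch of Eq. (1), the open-to-close log-return of one stock can be computed as follows. The function name and the three-day price lists are hypothetical and serve only as an illustration; they are not data from this study.

```python
import math

def open_to_close_log_returns(opens, closes):
    """Return x_t = log(E_t) - log(O_t) for one stock, day by day."""
    if len(opens) != len(closes):
        raise ValueError("opens and closes must have the same length")
    return [math.log(e) - math.log(o) for o, e in zip(opens, closes)]

# Hypothetical opening and closing prices for three trading days.
opens = [100.0, 102.0, 101.0]
closes = [102.0, 101.0, 101.0]
returns = open_to_close_log_returns(opens, closes)
```

A day on which the closing price equals the opening price yields a log-return of exactly zero, and overnight gaps between one close and the next open do not enter the series.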

According to the seminal work by Mantegna and Stanley [1], the log-return time
series of stock prices are modeled by Levy distributions. Superstatistics suggests
that a mixture of Gaussian distributions with $\chi^2$-distributions in terms of
variance gives a Levy distribution. Therefore, we may assume that each segment is
sampled from a Gaussian distribution with a different mean and variance. Namely, it
is assumed that the log-return time series of stock $i$ consists of $m_i$ stationary
segments. Segment $j$ follows a stationary Gaussian distribution with mean
$\mu_{i,j}$ and variance $\sigma_{i,j}^2$ ($j = 1, \ldots, m_i$).

To find the $m_i - 1$ unknown segment boundaries $t_j$ separating segments $j$ and
$j + 1$, we use the recursive segmentation scheme introduced by Siew et al. and
Bernaola et al. [13, 14]. Their segmentation scheme is fundamentally based on the
likelihood-ratio test between an i.i.d. Gaussian null model and an alternative model
consisting of two different Gaussian models for the total time series.

Firstly, suppose that there are $n$ observations $x_s$ ($s = 1, \ldots, n$). Let
$g(x; \mu, \sigma^2)$ be a Gaussian density

$$g(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right] \tag{2}$$

with parameters $\mu$ and $\sigma^2$. Assuming that the observations $x_s$ should be
segmented at $t$, that the observations on the left-hand side are sampled from
$g(x; \mu_L, \sigma_L^2)$, and that the ones on the right-hand side are sampled from
$g(x; \mu_R, \sigma_R^2)$, we define the likelihood functions

$$L_1 = \prod_{s=1}^{n} g(x_s; \mu, \sigma^2), \tag{3}$$

$$L_2(t) = \prod_{s=1}^{t} g(x_s; \mu_L, \sigma_L^2) \prod_{s=t+1}^{n} g(x_s; \mu_R, \sigma_R^2). \tag{4}$$

Furthermore, we define the logarithmic likelihood ratio between $L_1$ and $L_2(t)$ as

$$\Delta(t) = \log L_2(t) - \log L_1. \tag{5}$$
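Inserting Eqs. (3) and (4) into Eq. (5), we have

$$\Delta(t) = \sum_{s=1}^{t} \log g(x_s; \mu_L, \sigma_L^2) + \sum_{s=t+1}^{n} \log g(x_s; \mu_R, \sigma_R^2) - \sum_{s=1}^{n} \log g(x_s; \mu, \sigma^2). \tag{6}$$

By using the approximations

$$\frac{1}{n} \sum_{s=1}^{n} \log g(x_s; \mu, \sigma^2) \approx \int_{-\infty}^{\infty} g(x; \mu, \sigma^2) \log g(x; \mu, \sigma^2)\,dx = -\frac{1}{2} \log(2\pi e \sigma^2),$$

$$\frac{1}{t} \sum_{s=1}^{t} \log g(x_s; \mu_L, \sigma_L^2) \approx \int_{-\infty}^{\infty} g(x; \mu_L, \sigma_L^2) \log g(x; \mu_L, \sigma_L^2)\,dx = -\frac{1}{2} \log(2\pi e \sigma_L^2),$$

$$\frac{1}{n-t} \sum_{s=t+1}^{n} \log g(x_s; \mu_R, \sigma_R^2) \approx \int_{-\infty}^{\infty} g(x; \mu_R, \sigma_R^2) \log g(x; \mu_R, \sigma_R^2)\,dx = -\frac{1}{2} \log(2\pi e \sigma_R^2),$$

$\Delta(t)$ is rewritten as

$$\Delta(t) = n \log \sigma - t \log \sigma_L - (n - t) \log \sigma_R \geq 0. \tag{7}$$

$\sigma$, $\sigma_L$ and $\sigma_R$ are further approximated by their maximum
likelihood estimators, given by the empirical standard deviations

$$\sigma = \sqrt{\frac{1}{n} \sum_{s=1}^{n} x_s^2 - \left(\frac{1}{n} \sum_{s=1}^{n} x_s\right)^2}, \tag{8}$$

$$\sigma_L = \sqrt{\frac{1}{t} \sum_{s=1}^{t} x_s^2 - \left(\frac{1}{t} \sum_{s=1}^{t} x_s\right)^2}, \tag{9}$$

$$\sigma_R = \sqrt{\frac{1}{n-t} \sum_{s=t+1}^{n} x_s^2 - \left(\frac{1}{n-t} \sum_{s=t+1}^{n} x_s\right)^2}. \tag{10}$$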

$\Delta(t)$ can be used as an indicator to separate the observations into two parts.
An adequate way to separate the observations is to segment them at the $t$
where $\Delta(t)$ takes its maximum value. Namely, the segmentation should be
done at

$$t^* = \arg\max_t \Delta(t). \tag{11}$$
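The scan over candidate split points can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the code used in the study: the discriminator is taken in its standard-deviation form of Eq. (7), `t` counts the observations assigned to the left part, and the name `best_split`, the `min_len` guard and the toy series are hypothetical. Prefix sums keep each evaluation at constant cost.

```python
import math

def best_split(x, min_len=2):
    """Return (t_star, delta_max) maximising
    Delta(t) = n log s - t log s_L - (n - t) log s_R,
    where s, s_L, s_R are maximum likelihood standard deviations."""
    n = len(x)
    # Prefix sums of x and x^2 give the variance of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in x:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def log_sd(lo, hi):
        # Log of the ML standard deviation of x[lo:hi].
        m = hi - lo
        mean = (s1[hi] - s1[lo]) / m
        var = (s2[hi] - s2[lo]) / m - mean * mean
        return 0.5 * math.log(max(var, 1e-300))  # guard against log(0)

    whole = n * log_sd(0, n)
    best_t, best_delta = None, -math.inf
    for t in range(min_len, n - min_len + 1):
        delta = whole - t * log_sd(0, t) - (n - t) * log_sd(t, n)
        if delta > best_delta:
            best_t, best_delta = t, delta
    return best_t, best_delta

# Deterministic toy series: small fluctuations, then large ones.
x = [0.1, -0.1] * 50 + [2.0, -2.0] * 50
t_star, delta_max = best_split(x)
```

On this toy series the maximum falls at the point where the amplitude changes, after the first 100 observations.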

If $\max_t \Delta(t)$ is less than a threshold value $\Delta_c$, the segmentation is
terminated. Otherwise, the procedure is applied hierarchically: after a
segmentation, the same procedure is applied to each of the two resulting segments
recursively, and a segment is not divided any further once $\max_t \Delta(t)$ within
it is less than $\Delta_c$. This is used as the stopping condition of the recursive
segmentation procedure. Statistical theory tells us that, as the sample size
approaches $\infty$, $2\Delta(t)$ is asymptotically $\chi^2$ distributed with degrees
of freedom equal to the difference between the number of alternative model
parameters and the number of null model parameters. In this Gaussian case, the
degrees of freedom is set as $2 = 4 - 2$. The cumulative distribution function of
the $\chi^2$ distribution with 2 degrees of freedom is given by

$$F(x) = 1 - \exp\left(-\frac{x}{2}\right). \tag{12}$$

Therefore, for the level of significance $\alpha$ ($0 < \alpha < 1$), requiring
$F(2\Delta_c) = \alpha$ gives the threshold

$$\Delta_c = -\log(1 - \alpha). \tag{13}$$

For example, $\alpha = 0.99995$ gives $\Delta_c \approx 9.9$.

Consequently, for stock $i$ ($i = 1, \ldots, M$), we obtain $m_i$ segments and $m_i$
sets of Gaussian parameters $\{\mu_{i,j}, \sigma_{i,j}\}$ ($j = 1, \ldots, m_i$).
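The recursive procedure and its stopping condition can be sketched as below. This is a hedged illustration, not the implementation used for the empirical analysis: the function names, the `min_len` guard against very short segments and the toy series are assumptions, and the split search is written in a simple quadratic form for readability.

```python
import math

def log_sd(x):
    """Log of the maximum likelihood standard deviation of x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return 0.5 * math.log(max(var, 1e-300))  # guard against log(0)

def best_split(x, min_len):
    """Return (t, Delta(t)) at the maximum of the Gaussian discriminator."""
    n = len(x)
    whole = n * log_sd(x)
    best_t, best_delta = None, -math.inf
    for t in range(min_len, n - min_len + 1):
        delta = whole - t * log_sd(x[:t]) - (n - t) * log_sd(x[t:])
        if delta > best_delta:
            best_t, best_delta = t, delta
    return best_t, best_delta

def segment(x, delta_c=10.0, min_len=10, offset=0):
    """Return the sorted 0-based indices at which a new segment starts."""
    if len(x) < 2 * min_len:
        return []
    t, delta = best_split(x, min_len)
    if delta < delta_c:  # stopping condition: no significant split left
        return []
    left = segment(x[:t], delta_c, min_len, offset)
    right = segment(x[t:], delta_c, min_len, offset + t)
    return left + [offset + t] + right

def segment_parameters(x, boundaries):
    """Return (start, end, mean, sd) per segment, with end exclusive."""
    edges = [0] + list(boundaries) + [len(x)]
    out = []
    for a, b in zip(edges[:-1], edges[1:]):
        seg = x[a:b]
        out.append((a, b, sum(seg) / len(seg), math.exp(log_sd(seg))))
    return out

# Deterministic toy series with one change in amplitude.
x = [0.1, -0.1] * 50 + [2.0, -2.0] * 50
boundaries = segment(x)
params = segment_parameters(x, boundaries)
```

With the threshold of 10 used in the text, the toy series is cut once, at the amplitude change, and neither half is divided further because the discriminator inside each half stays far below the threshold.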



2.2 General case 

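More generally, let us define the discriminator $\Delta(t)$ given in Eq. (5). Assume
that $p(x; \theta)$ is a probability density function (model) parameterized by
$\theta$. Let $\hat{\theta}$, $\hat{\theta}_L$, and $\hat{\theta}_R$ be maximum
likelihood estimators computed by

$$\hat{\theta} = \arg\max_{\theta} \sum_{s=1}^{n} \log p(x_s; \theta), \tag{14}$$

$$\hat{\theta}_L = \arg\max_{\theta} \sum_{s=1}^{t} \log p(x_s; \theta), \tag{15}$$

$$\hat{\theta}_R = \arg\max_{\theta} \sum_{s=t+1}^{n} \log p(x_s; \theta), \tag{16}$$

where $\hat{\theta}_L$ and $\hat{\theta}_R$ are the maximum likelihood estimators in
the left and right sequences, respectively. Assuming the likelihood functions

$$L_1 = \prod_{s=1}^{n} p(x_s; \hat{\theta}), \tag{17}$$

$$L_2(t) = \prod_{s=1}^{t} p(x_s; \hat{\theta}_L) \prod_{s=t+1}^{n} p(x_s; \hat{\theta}_R), \tag{18}$$

we can rewrite Eq. (5) as

$$\Delta(t) = -n \int p(x; \hat{\theta}) \log p(x; \hat{\theta})\,dx + t \int p(x; \hat{\theta}_L) \log p(x; \hat{\theta}_L)\,dx + (n - t) \int p(x; \hat{\theta}_R) \log p(x; \hat{\theta}_R)\,dx. \tag{19}$$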
$\Delta(t)/n$ is equivalent to

$$\frac{\Delta(t)}{n} = H[p(x; \hat{\theta})] - \frac{t}{n} H[p(x; \hat{\theta}_L)] - \frac{n - t}{n} H[p(x; \hat{\theta}_R)], \tag{20}$$

where $H[p(x; \theta)]$ denotes the Shannon entropy defined as

$$H[p(x; \theta)] = -\int p(x; \theta) \log p(x; \theta)\,dx. \tag{21}$$
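As a consistency check, the entropy form of Eq. (20) can be specialised to the Gaussian model, where it should reduce to Eq. (7). The short derivation below is a sketch of that step and uses only the Gaussian entropy already implied by the approximations in Sec. 2.1.

```latex
% Entropy of a Gaussian density g(x; \mu, \sigma^2):
H[g] = -\int g \log g \, dx = \tfrac{1}{2} \log(2 \pi e \sigma^2)

% Substituting into Eq. (20):
\frac{\Delta(t)}{n}
  = \tfrac{1}{2} \log(2 \pi e \sigma^2)
  - \frac{t}{n} \, \tfrac{1}{2} \log(2 \pi e \sigma_L^2)
  - \frac{n - t}{n} \, \tfrac{1}{2} \log(2 \pi e \sigma_R^2)

% The constant terms cancel because 1 - t/n - (n - t)/n = 0, leaving
\frac{\Delta(t)}{n}
  = \log \sigma - \frac{t}{n} \log \sigma_L - \frac{n - t}{n} \log \sigma_R
```

Multiplying the last line by $n$ recovers Eq. (7).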



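In the general case, the degrees of freedom $k$ is set as the dimensionality of
the set of model parameters $\theta$. The cumulative distribution function of the
$\chi^2$ distribution with $k$ degrees of freedom is given by the regularized
incomplete gamma function,

$$F(x) = \frac{1}{\Gamma(k/2)} \gamma\left(\frac{k}{2}, \frac{x}{2}\right) = \frac{1}{\Gamma(k/2)} \int_0^{x/2} t^{k/2 - 1} \exp(-t)\,dt. \tag{22}$$

The threshold $\Delta_c$ for the termination condition under a given level of
significance $\alpha$ is calculated from the regularized incomplete gamma function
by requiring $F(2\Delta_c) = \alpha$, such that

$$\frac{1}{\Gamma(k/2)} \gamma\left(\frac{k}{2}, \Delta_c\right) = \alpha. \tag{23}$$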
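The threshold for a general number of degrees of freedom can be obtained numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it assumes that $2\Delta$ is the $\chi^2$-distributed statistic while $\Delta$ itself is compared with the threshold, it evaluates the regularized incomplete gamma function by its power series, and it inverts it by bisection. The function names are hypothetical.

```python
import math

def chi2_cdf(x, k):
    """CDF of the chi-squared distribution with k degrees of freedom,
    i.e. the regularized lower incomplete gamma function P(k/2, x/2),
    evaluated with its power series."""
    if x <= 0.0:
        return 0.0
    a, z = 0.5 * k, 0.5 * x
    term = 1.0 / math.gamma(a + 1.0)
    total = term
    n = 0
    while term > 1e-17 * total and n < 100000:
        n += 1
        term *= z / (a + n)
        total += term
    return min(1.0, total * math.exp(a * math.log(z) - z))

def threshold(alpha, k=2):
    """Solve chi2_cdf(2 * delta_c, k) = alpha for delta_c by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(2.0 * hi, k) < alpha:  # bracket the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(2.0 * mid, k) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Two degrees of freedom, as in the Gaussian case of Sec. 2.1.
delta_c = threshold(0.99995, k=2)
```

For two degrees of freedom and a level of 0.99995 this gives a value of about 9.9, close to the threshold of 10 used in the numerical and empirical sections.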



3 Numerical study 

Artificial time series are generated from i.i.d. normal random variables with
different variances. A time series consisting of 4 segments with different variances
is generated. The first segment is sampled from the normal distribution with mean 0
and standard deviation 1, the second from the normal distribution with mean 0 and
standard deviation 2, the third from mean 0 and standard deviation 1, and the fourth
from mean 0 and standard deviation 3. The length of each segment is set as 500.
Let $x(t)$ be the resulting $T = 2{,}000$ independent normal random variables with
different variances, generated by means of the Box-Muller algorithm. $\Delta_c$ is
fixed as 10 ($\alpha = 0.99995$).
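A series of this kind can be generated as sketched below. The segment lengths and standard deviations follow the description above, while the function names, the seed and the use of Python's `random.Random` as the uniform source are assumptions made for illustration, so the numbers will not match those reported in this section.

```python
import math
import random

def box_muller(rng):
    """Return one pair of independent standard normal variates."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

def make_series(sds=(1.0, 2.0, 1.0, 3.0), seg_len=500, seed=0):
    """Concatenate zero-mean Gaussian segments with the given std. devs."""
    rng = random.Random(seed)
    x = []
    for sd in sds:
        seg = []
        while len(seg) < seg_len:
            seg.extend(box_muller(rng))
        x.extend(sd * z for z in seg[:seg_len])
    return x

def sample_sd(v):
    """Maximum likelihood (population) standard deviation."""
    mean = sum(v) / len(v)
    return math.sqrt(sum((a - mean) ** 2 for a in v) / len(v))

x = make_series()
sds_hat = [sample_sd(x[i:i + 500]) for i in range(0, 2000, 500)]
```

The per-segment sample standard deviations in `sds_hat` should lie close to 1, 2, 1 and 3, which is a quick sanity check before the series is passed to a segmentation routine.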

Fig. 1 shows the time series segmented by the proposed hierarchical procedure.
The length, mean and standard deviation of each segment are shown in Tab. 1. From
the table, it is found that both the first and fourth segments are recovered
exactly, whereas the second and third segments differ slightly from the actual
setting (the boundary between them is detected at $t = 1017$ rather than
$t = 1000$). The segments determined by the recursive segmentation procedure are
approximately the ones assumed in advance. We confirmed that the proposed procedure
can determine the positions where the variance changes.
Fig. 1 The time series consisting of 4 segments. Sequences in each segment are sampled from
a zero-mean Gaussian distribution. The standard deviation is set as 1, 2, 1, and 3 from the left
segment. The length of each segment is set as 500.

Table 1 The length, mean and standard deviation of each segment detected by the proposed
method.

segment   start   end    length   mean       std. dev.
1         1       500    500      -0.00691   1.000
2         501     1017   517      0.0524     1.986
3         1018    1500   483      0.0320     0.980
4         1501    2000   500      0.212      2.929

4 Empirical analysis 

For the empirical analysis, 1,413 companies listed on the first section of the Tokyo
Stock Exchange are selected ($M = 1{,}413$). The observation period is from 4 January
2000 to 30 January 2012 and contains 2,675 business days ($n = 2{,}675$).
All of these companies were listed throughout the observation period. I applied the
recursive segmentation procedure to the 1,413 security prices, analyzing the
daily log-return time series between daily opening and closing prices defined in
Eq. (1). Throughout the investigation $\Delta_c$ is fixed as 10 ($\alpha = 0.99995$).

Fig. 2 shows the daily log-return time series of Toyota Motor Corp (7203) segmented
by using the proposed procedure. Nine segments are obtained in this case.
The boundaries are located at 2000-07-03, 2004-05-10, 2004-09-03, 2005-09-16,
2008-01-08, 2008-10-03, 2008-12-15, and 2009-04-16.
Fig. 2 The daily log-return time series of Toyota Motor Corp (7203) during the period from 4th
January 2000 to 30th January 2012.

Fig. 3 shows the number of starting dates of segments for the 1,413 log-return time
series. The number of segments increases at (a) June 2000, (b) April 2004, (c)
February 2006, and (d) 2007 to 2009. These seem to correspond to regimes or change
points in the Japanese economy. Specifically, during the latest global financial
crisis the number of segments tends to increase (about 260 segments can be found in
this period). Furthermore, after (e) 11 March 2011, the date of the Great East Japan
Earthquake, the number of segments increased steeply, exceeding the level seen
during the latest global financial crisis.

Fig. 3 The number of starting dates of segments for 1,413 log-return time series during the period
from 4th January 2000 to 30th January 2012.

The number of segments belonging to the same quintile of variances is counted. I
computed the order statistics of variance
$\{\sigma^2_{i,(1)} \leq \cdots \leq \sigma^2_{i,(m_i)}\}$ from the $m_i$ segments of
stock $i$. Next, each segment of stock $i$ is labeled $k \in \{1, 2, 3, 4, 5\}$
according to the quintile to which its variance belongs. The number of segments
which have the same label is counted on each day. Fig. 4 shows the number of
segments belonging to each quintile on every day. The number of first-quintile
segments indicates stability of economic affairs, and the number of fifth-quintile
segments indicates instability of economic affairs. It is found that from 2003 to
2007 (I) the Japanese economy was in a stable regime. From the end of 2007 (II) an
unstable regime was observed. Specifically, in September 2008 (III), when we
experienced the Lehman shock, the number of fifth-quintile regimes steeply
increased. This implies that the money flows of the Japanese economy became unstable
just after the Lehman shock. From March 2009 (IV) the money flow eventually
recovered: the number of unstable regimes decreased and the number of stable
segments increased. From 11 March to 10 April 2011 (V), the number of unstable
regimes steeply increased due to the Great East Japan Earthquake. However, after
that the number of stable regimes rapidly increased and Japanese macroeconomic
affairs eventually recovered.
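The quintile labelling and the daily counts can be sketched as follows. The text does not spell out how a rank is mapped to a quintile when a stock has fewer or more than five segments, so the midpoint-rank rule used here is an assumption, as are the function names and the toy segment lists.

```python
def quintile_labels(variances):
    """Label each segment 1..5 by the rank of its variance among the
    segments of the same stock (1 = lowest fifth, 5 = highest fifth).
    Assumed rule: the midpoint rank (rank + 0.5) / m is mapped to a fifth."""
    m = len(variances)
    order = sorted(range(m), key=lambda j: variances[j])
    labels = [0] * m
    for rank, j in enumerate(order):
        labels[j] = (5 * (2 * rank + 1)) // (2 * m) + 1
    return labels

def daily_quintile_counts(stocks, n_days):
    """stocks: one list of segments per stock; each segment is a tuple
    (start, end, variance) with 0-based inclusive day indices.
    Returns counts[day][k - 1], the number of segments labeled k that day."""
    counts = [[0] * 5 for _ in range(n_days)]
    for segments in stocks:
        labels = quintile_labels([v for _, _, v in segments])
        for (start, end, _), k in zip(segments, labels):
            for day in range(start, end + 1):
                counts[day][k - 1] += 1
    return counts

# Two hypothetical stocks observed over four days.
stocks = [
    [(0, 1, 0.1), (2, 3, 0.9)],  # a calm segment, then a volatile one
    [(0, 3, 0.5)],               # a single segment covering all days
]
counts = daily_quintile_counts(stocks, n_days=4)
```

Each stock contributes exactly one segment per day, so every row of `counts` sums to the number of stocks; the five columns correspond to the five curves described for Fig. 4.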

Normally, only the past time series is available when we want to foresee future
macroeconomic conditions. To do so, the obtained results should at least be robust.
In order to confirm the robustness, that is, the stability of the number of segments
in each regime, we consider two different time periods: 2000-2010 (estimation) and
2000-2012 (realization), and compare their results. The second uses the same time
series as the one shown in Fig. 4. Fig. 5 shows the number of regimes belonging to
each quintile for two different periods. As seen from them, it is confirmed that
they are almost the same until 2010. Namely, the time series may be employed to
foresee future macroeconomic conditions.



5 Conclusion 



A comprehensive time series segmentation analysis of Japanese security prices
was conducted. The daily log-returns of 1,413 securities listed on the first
section of the Tokyo Stock Exchange from 4 January 2000 to 14 December 2010 were
analyzed by using a recursive segmentation procedure based on the Jensen-Shannon
divergence.
Fig. 4 The number of segments in each quintile for 1,413 log-return time series during 4th January
2000 to 30th January 2012.

It was found that the number of segments increased at (a) June 2000, (b) April
2004, (c) February 2006, (d) 2007 to 2009, and (e) March 2011. These seemed to
correspond to regimes or change points in the Japanese economy.

The number of segments belonging to the same quintile of variances was counted.
It was found that from 2003 to 2007 (I) the Japanese economy was in a stable regime.
From the end of 2007 (II) an unstable regime was observed. Specifically, in September
2008 (III), when we experienced the Lehman shock, the number of fifth-quintile
regimes steeply increased. This implies that the money flows of the Japanese economy
became unstable just after the Lehman shock. From March 2009 (IV) the money
flow eventually recovered: the number of unstable regimes decreased and the
number of stable segments increased. After 11 March 2011 (V), the date of the Great
East Japan Earthquake, the number of segments steeply increased. From 11 March
to 10 April 2011, the macroeconomic situation seemed to be negative; however,
it eventually recovered after that.

The volatility distribution computed from daily log-returns for a broad spectrum
of stock prices traded on the Tokyo Stock Exchange provides us with
information on money flows at several economic levels. It may be further useful for
understanding the macroeconomic affairs of Japan in a quantitative way.
Fig. 5 The number of segments in each quintile for 1,413 log-return time series during 4th January
2000 to 14th December 2010.